This sub-chapter analyzes salaries for different occupations in New York City. The values represent the yearly salary of the corresponding occupation.
To get an overview of how salaries are distributed across occupations in New York City, we first draw a Cleveland dot plot of the 10-year average salary of each occupation. The result shows a huge difference in salary between occupations. The range reaches 72947, which is about 3 times the minimum salary.
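The average and the range-to-minimum comparison described above can be sketched in base R. This is a minimal illustration, not the chapter's actual code: the data frame `wages`, its column names, and all numbers are hypothetical.

```r
# Hypothetical long-format data: one row per occupation and year.
# All names and values are made up for illustration.
wages <- data.frame(
  occupation = rep(c("Occupation A", "Occupation B", "Occupation C"), each = 3),
  year       = rep(2010:2012, times = 3),
  salary     = c(90000, 92000, 94000,
                 70000, 71000, 72000,
                 23000, 24000, 25000),
  stringsAsFactors = FALSE
)

# Average salary per occupation over all available years.
avg <- aggregate(salary ~ occupation, data = wages, FUN = mean)
avg <- avg[order(avg$salary), ]

# Range of the averages, and the range relative to the minimum salary.
salary_range <- max(avg$salary) - min(avg$salary)
range_ratio  <- salary_range / min(avg$salary)

# Base R Cleveland dot plot of the averages.
dotchart(avg$salary, labels = avg$occupation, xlab = "Average yearly salary")
```

Sorting the averages before plotting keeps the dots in rank order, which makes the gap between the top and bottom occupations easier to read.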
## Top 3 and Last 3 Occupations in Salary
We usually expect salaries to increase over the years. The plot, however, shows that salaries do not increase for every type of occupation. Only two types of occupations have lower salaries in 2017 to 2019 than in 2010 to 2013, namely farming, fishing, and forestry occupations and healthcare support occupations. Generally speaking, there are two different salary trends: a monotonically increasing trend, which covers 18 occupations, and a trend that first decreases and then increases, which covers 7 occupations.
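One way to separate the two trend groups is to look at the signs of the successive changes in a salary series. The sketch below assumes each occupation is summarized by a few period averages in time order. The function `classify_trend`, the occupation labels, and the numbers are hypothetical, and the chapter may have assigned the groups differently (for example, by reading the plot).

```r
# Classify a salary series (ordered by time) by the signs of its successive changes.
classify_trend <- function(x) {
  d <- diff(x)
  if (all(d > 0)) {
    return("increasing")
  }
  first_rise <- which(d > 0)[1]  # NA if the series never rises
  if (!is.na(first_rise) && first_rise > 1 &&
      all(d[seq_len(first_rise - 1)] < 0) &&
      all(d[first_rise:length(d)] > 0)) {
    return("decrease then increase")
  }
  "other"
}

# Hypothetical period averages, in thousands, for three made-up occupations.
period_avg <- list(
  "Occupation A" = c(100, 105, 112),
  "Occupation B" = c(30, 27, 28),
  "Occupation C" = c(50, 48, 45)
)

trends <- sapply(period_avg, classify_trend)
```

In this sketch "Occupation B" falls first and recovers only partly, so it lands in the second group even though its final value is below its starting value.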
There are 18 occupations in the monotonically increasing trend group.

1. Legal occupations
2. Health diagnosing and treating practitioners and other technical occupations
3. Computer and mathematical occupations
4. Management occupations
5. Business and financial operations occupations
6. Architecture and engineering occupations
7. Arts, design, entertainment, sports, and media occupations
8. Education, training, and library occupations
9. Installation, maintenance, and repair occupations
10. Community and social service occupations
11. Construction and extraction occupations
12. Office and administrative support occupations
13. Sales and related occupations
14. Transportation occupations
15. Production occupations
16. Building and grounds cleaning and maintenance occupations
17. Material moving occupations
18. Food preparation and serving related occupations
In the group that first decreases and then increases, the salaries of some occupations dropped a lot and then recovered only a little, so these occupations show a decreasing trend overall. For other occupations, the salaries dropped a little and then rose a lot, so they show an increasing trend overall. To learn more about the exact salary trends in this group, we draw scatter plots and analyze each occupation in detail.
From the above plots, we see both the variations in salaries and the general trends for these occupations.

1. Law enforcement workers including supervisors: The salary follows a wave pattern. The crests are in 2011, 2015, and 2019. The troughs are in 2012 and 2016. There are two special points for this occupation.
   a) The salary dropped a lot from 2015 to 2016 and then quickly returned to its normal level from 2016 to 2017.
   b) The salary decreased from 2017 to 2018, but it did not continue to decrease. Instead, it increased a lot from 2018 to 2019.
2. Life, physical, and social science occupations: The salary also follows a wave pattern, with an increasing trend overall. The crests are in 2013 and 2018. The troughs are in 2012 and 2014.
3. Health technologists and technicians: From 2010 to 2013, the salary grew slightly at a steady rate. From 2014 to 2015, however, it dropped suddenly. After that, it started to increase at a higher rate.
4. Fire fighting and prevention, and other protective service workers including supervisors: The salary follows a wave pattern and stays at roughly the same level overall. The crests are in 2012, 2013, and 2017, and the troughs are in 2011 and 2016. There is one special point for this occupation.
   a) The salary increased suddenly from 2016 to 2017.
5. Healthcare support occupations: Generally speaking, the salary is decreasing. The crests occurred in 2010, 2013, and 2015. The troughs occurred in 2014 and 2017.
6. Personal care and service occupations: Generally speaking, the salary is increasing. It remained relatively stable before 2015 and then increased at a relatively high rate.
7. Farming, fishing, and forestry occupations: The salary increased a little from 2011 to 2012 and then decreased quickly from 2012 to 2014. After that, it recovered at a slower but steady rate.
It is also important to analyze how much the salaries of different occupations vary. Because different occupations have different base wages, it can be more meaningful to express wage fluctuations as a percentage of the wage. Here, we use the average wage to represent the wage of each occupation.
The above plot shows that the majority of these occupations have positive variances over the past decade. Only two of these categories have negative variances. Among all occupations, Construction and extraction occupations have the biggest variation in salary from 2010 to 2019, and Healthcare support occupations have the smallest.
To see the salary variances of the 25 occupations in detail, we draw boxplots to make comparisons.
As this plot shows, the counties with the highest and lowest wages differ from occupation to occupation. For the majority of the occupations, the highest salaries are in New York County and the lowest salaries are in Bronx County. The detailed distribution is shown in the following table.
| Boroughs | With the lowest wage | With the highest wage |
|---|---|---|
| Bronx County | Legal occupations, Health diagnosing and treating practitioners and other technical occupations, Computer and mathematical occupations, Management occupations, Business and financial operations occupations, Architecture and engineering occupations, Life, physical, and social science occupations, Arts, design, entertainment, sports, and media occupations, Health technologists and technicians, Education, training, and library occupations, Installation, maintenance, and repair occupations, Community and social service occupations, Construction and extraction occupations, Office and administrative support occupations, Sales and related occupations, Transportation occupations, Fire fighting and prevention, and other protective service workers including supervisors, Production occupations, Building and grounds cleaning and maintenance occupations, Personal care and service occupations, Food preparation and serving related occupations | |
| Kings County | Law enforcement workers including supervisors | |
| New York County | Healthcare support occupations, Material moving occupations, Farming, fishing, and forestry occupations | Legal occupations, Health diagnosing and treating practitioners and other technical occupations, Computer and mathematical occupations, Management occupations, Arts, design, entertainment, sports, and media occupations, Health technologists and technicians, Education, training, and library occupations, Community and social service occupations, Office and administrative support occupations, Sales and related occupations, Food preparation and serving related occupations |
| Queens County | | Farming, fishing, and forestry occupations |
| Richmond County | | Law enforcement workers including supervisors, Business and financial operations occupations, Architecture and engineering occupations, Life, physical, and social science occupations, Installation, maintenance, and repair occupations, Construction and extraction occupations, Transportation occupations, Fire fighting and prevention, and other protective service workers including supervisors, Production occupations, Building and grounds cleaning and maintenance occupations, Material moving occupations, Personal care and service occupations |
From the above table, we discover the following characteristics.

1) The majority of the highest salaries occur in New York County, many of them also appear in Richmond County, one of them appears in Queens County, and none of them appear in Kings County or Bronx County.
2) Most of the lowest salaries appear in Bronx County. Several occupations have their lowest salaries in New York County, and one occupation, Law enforcement workers including supervisors, has its lowest salary in Kings County.
3) The occupations with the lowest salaries in New York County tend to be occupations with relatively low incomes.
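A table like the one above can be derived by picking, for each occupation, the county with the smallest and the largest salary. The following base R sketch shows one way to do that. The data frame, the helper `find_extremes`, and all values are hypothetical.

```r
# Hypothetical average salaries per occupation and county (values made up).
wages <- data.frame(
  occupation = rep(c("Occupation A", "Occupation B"), each = 3),
  county     = rep(c("Bronx County", "New York County", "Richmond County"), times = 2),
  salary     = c(40000, 55000, 50000,
                 30000, 28000, 35000),
  stringsAsFactors = FALSE
)

# For one occupation's rows, return the counties with the lowest and highest salary.
find_extremes <- function(d) {
  data.frame(
    occupation = d$occupation[1],
    lowest     = d$county[which.min(d$salary)],
    highest    = d$county[which.max(d$salary)],
    stringsAsFactors = FALSE
  )
}

# Apply the helper to each occupation and stack the results.
extremes <- do.call(rbind, lapply(split(wages, wages$occupation), find_extremes))

# Count how often each county holds the lowest or the highest wage.
lowest_counts  <- table(extremes$lowest)
highest_counts <- table(extremes$highest)
```

Note that `which.min` and `which.max` return the first match, so ties between counties would need separate handling.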
Next, we draw a bar chart to reflect the specific distribution data of the highest and lowest wages in different counties (the average of all years).
## Distribution of the Highest and Lowest Wages in Different Counties by Years

To see whether the distribution of the highest and lowest wages across counties changes over time, we draw a stacked bar chart by year. We use different colors to represent different counties.
As can be seen in this plot, from the perspective of each year alone, the situation is slightly different from the overall average, which is reflected in the following aspects.
### For the max salary

In the overall trend, no occupation has its maximum salary in Bronx County or Kings County. However, the stacked bar chart shows that in 2010, 2012, 2013, and 2014 some occupations have their highest salary in Bronx County. Also, in every year except 2014 and 2017, some occupations have their highest salary in Kings County.

### For the min salary

In the overall trend, no occupation has its minimum salary in Queens County or Richmond County. However, the stacked bar chart shows that in every year except 2013 some occupations have their lowest salary in Queens County. Also, in every year except 2018, some occupations have their lowest salary in Richmond County.
We also find that the amount of variation among boroughs differs between occupations. Therefore, we use a bar chart to rank the occupations by their degree of variation among boroughs. For each occupation, we subtract the smallest salary from the salary in each of the five counties, add up the differences, and divide the sum by 5. Then we divide this value by the smallest salary to obtain the variance of the occupation.
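The steps above translate into a short function. This is a sketch under the stated definition. The function name `borough_variation` and the example salaries are hypothetical.

```r
# Relative variation among boroughs for one occupation:
# the mean excess over the smallest salary, divided by the smallest salary.
borough_variation <- function(salaries) {
  lowest <- min(salaries)
  # With five counties, mean() is the sum of the differences divided by 5.
  mean(salaries - lowest) / lowest
}

# Hypothetical salaries of one occupation in the five counties.
example_salaries <- c(50000, 40000, 45000, 42000, 48000)
variation <- borough_variation(example_salaries)
```

For the example values, the differences from the minimum are 10000, 0, 5000, 2000, and 8000. Their mean is 5000, and dividing by the minimum of 40000 gives 0.125.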
### Top 5 in Variation

1. Sales and related occupation
2. Legal occupations
3. Management occupations
4. Farming, fishing, and forestry occupations
5. Arts, design, entertainment, sports, and media occupations

### Last 5 in Variation

1. Personal care and service occupations
2. Health technologists and technicians
3. Community and social service occupations
4. Food preparation and serving related occupations
5. Life, physical, and social science occupations
As this Cleveland dot plot shows, the salaries of some occupations vary a lot between genders, while other occupations have similar salaries for the two genders. Also, in some occupations men have higher salaries, and in others women have higher salaries. To understand these characteristics better, we analyze the salaries of the two genders in each occupation in more detail.

## Salaries Variance between Genders

We use a bar chart to rank the occupations by the salary variance between genders. To quantify the difference, we divide the income difference between male and female employees by the average salary of the occupation.
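The measure described above can be sketched as follows. The text does not say how the average salary of an occupation is computed, so this sketch assumes the plain mean of the male and female salaries. The function `gender_gap`, the data frame, and all values are hypothetical.

```r
# Relative gender gap: (male - female) salary divided by the occupation's average salary.
gender_gap <- function(male, female, occupation_avg) {
  (male - female) / occupation_avg
}

# Hypothetical salaries for three made-up occupations.
gaps <- data.frame(
  occupation = c("Occupation A", "Occupation B", "Occupation C"),
  male       = c(60000, 40000, 52000),
  female     = c(54000, 42000, 50000),
  stringsAsFactors = FALSE
)

# Assumption: the occupation average is the mean of the two gender salaries.
gaps$occupation_avg <- (gaps$male + gaps$female) / 2
gaps$gap <- gender_gap(gaps$male, gaps$female, gaps$occupation_avg)

# Order from the largest male advantage to the largest female advantage.
gaps <- gaps[order(gaps$gap, decreasing = TRUE), ]
```

A positive value means male employees earn more, and a negative value means female employees earn more, so the sign of `gap` also tells which side of the bar chart an occupation falls on.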
## General Characteristics of Salaries Variance between Genders

From the horizontal bar chart above, we discover the following characteristics. For most occupations, male employees have higher salaries than female employees. Female employees have higher salaries in only 4 of the 25 kinds of occupations, namely Construction and extraction occupations, Installation, maintenance, and repair occupations, Community and social service occupations, and Transportation occupations.
We also want to analyze how stable the gap between genders is, to see whether there is a relation between the size and the stability of the variance.
However, there is no clear connection between the size and the stability of the variance.
Intuitively, the gender composition of employees in a profession is related to the relative wage levels of the two genders. We want to check whether this intuition holds. Therefore, we use two categorical variables to represent the two characteristics, namely “Gender Distribution” and “Salary Distribution”. “Gender Distribution” has two values: Male-dominated, which means the occupation has more male employees than female employees, and Female-dominated, which means it has more female employees than male employees. “Salary Distribution” also has two values: Male-higher, which means male employees have the higher salary in the occupation, and Female-higher, which means female employees have the higher salary. Then we draw a mosaic plot to measure the relation.

From this mosaic plot, we can see that salary distribution is related to gender composition. However, the direction of this connection goes against intuition. We tend to think that the “Female-higher” group will contain more female-dominated occupations and the “Male-higher” group will contain more male-dominated occupations. The plot shows the opposite.
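The two categorical variables described above can be cross-tabulated and drawn as a mosaic plot. The sketch below uses base R's `table()` and `mosaicplot()` rather than any particular plotting package, and the five occupations in it are made up for illustration.

```r
# Hypothetical classification of five made-up occupations.
occ <- data.frame(
  gender_distribution = c("Male-dominated", "Male-dominated", "Female-dominated",
                          "Female-dominated", "Male-dominated"),
  salary_distribution = c("Female-higher", "Male-higher", "Male-higher",
                          "Male-higher", "Female-higher"),
  stringsAsFactors = FALSE
)

# Fix the level order so that all four cells appear in the table, even empty ones.
occ$gender_distribution <- factor(occ$gender_distribution,
                                  levels = c("Male-dominated", "Female-dominated"))
occ$salary_distribution <- factor(occ$salary_distribution,
                                  levels = c("Male-higher", "Female-higher"))

# Contingency table: rows are gender composition, columns are who earns more.
tab <- table(occ$gender_distribution, occ$salary_distribution)

# Mosaic plot of the table: tile areas are proportional to the cell counts.
mosaicplot(tab, main = "Gender composition vs. salary distribution",
           xlab = "Gender Distribution", ylab = "Salary Distribution")
```

In this made-up example the "Female-higher" occupations are all male-dominated, which mirrors the counterintuitive pattern described in the text.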